Chester County
Neural network interpretability with layer-wise relevance propagation: novel techniques for neuron selection and visualization
Bhati, Deepshikha, Neha, Fnu, Amiruzzaman, Md, Guercio, Angela, Shukla, Deepak Kumar, Ward, Ben
Interpreting complex neural networks is crucial for understanding their decision-making processes, particularly in applications where transparency and accountability are essential. This proposed method addresses this need by focusing on layer-wise Relevance Propagation (LRP), a technique used in explainable artificial intelligence (XAI) to attribute neural network outputs to input features through backpropagated relevance scores. Existing LRP methods often struggle with precision in evaluating individual neuron contributions. To overcome this limitation, we present a novel approach that improves the parsing of selected neurons during LRP backward propagation, using the Visual Geometry Group 16 (VGG16) architecture as a case study. Our method creates neural network graphs to highlight critical paths and visualizes these paths with heatmaps, optimizing neuron selection through accuracy metrics like Mean Squared Error (MSE) and Symmetric Mean Absolute Percentage Error (SMAPE). Additionally, we utilize a deconvolutional visualization technique to reconstruct feature maps, offering a comprehensive view of the network's inner workings. Extensive experiments demonstrate that our approach enhances interpretability and supports the development of more transparent artificial intelligence (AI) systems for computer vision applications. This advancement has the potential to improve the trustworthiness of AI models in real-world machine vision applications, thereby increasing their reliability and effectiveness.
A Tiered GAN Approach for Monet-Style Image Generation
Neha, FNU, Bhati, Deepshikha, Shukla, Deepak Kumar, Amiruzzaman, Md
Generative Adversarial Networks (GANs) have proven to be a powerful tool in generating artistic images, capable of mimicking the styles of renowned painters, such as Claude Monet. This paper introduces a tiered GAN model to progressively refine image quality through a multi-stage process, enhancing the generated images at each step. The model transforms random noise into detailed artistic representations, addressing common challenges such as instability in training, mode collapse, and output quality. This approach combines downsampling and convolutional techniques, enabling the generation of high-quality Monet-style artwork while optimizing computational efficiency. Experimental results demonstrate the architecture's ability to produce foundational artistic structures, though further refinements are necessary for achieving higher levels of realism and fidelity to Monet's style. Future work focuses on improving training methodologies and model complexity to bridge the gap between generated and true artistic images. Additionally, the limitations of traditional GANs in artistic generation are analyzed, and strategies to overcome these shortcomings are proposed.
From classical techniques to convolution-based models: A review of object detection algorithms
Neha, Fnu, Bhati, Deepshikha, Shukla, Deepak Kumar, Amiruzzaman, Md
Object detection is a fundamental task in computer vision and image understanding, with the goal of identifying and localizing objects of interest within an image while assigning them corresponding class labels. Traditional methods, which relied on handcrafted features and shallow models, struggled with complex visual data and showed limited performance. These methods combined low-level features with contextual information and lacked the ability to capture high-level semantics. Deep learning, especially Convolutional Neural Networks (CNNs), addressed these limitations by automatically learning rich, hierarchical features directly from data. These features include both semantic and high-level representations essential for accurate object detection. This paper reviews object detection frameworks, starting with classical computer vision methods. We categorize object detection approaches into two groups: (1) classical computer vision techniques and (2) CNN-based detectors. We compare major CNN models, discussing their strengths and limitations. In conclusion, this review highlights the significant advancements in object detection through deep learning and identifies key areas for further research to improve performance.
Distributed Swarm Intelligence
Kanjula, Karthik Reddy, Kolla, Sai Meghana
This paper presents the development of a distributed application that facilitates the understanding and application of swarm intelligence in solving optimization problems. The platform comprises a search space of customizable random particles, allowing users to tailor the solution to their specific needs. By leveraging the power of Ray distributed computing, the application can support multiple users simultaneously, offering a flexible and scalable solution. The primary objective of this project is to provide a user-friendly platform that enhances the understanding and practical use of swarm intelligence in problem-solving.
Light in the Larynx: a Miniaturized Robotic Optical Fiber for In-office Laser Surgery of the Vocal Folds
Chiluisa, Alex J., Pacheco, Nicholas E., Do, Hoang S., Tougas, Ryan M., Minch, Emily V., Mihaleva, Rositsa, Shen, Yao, Liu, Yuxiang, Carroll, Thomas L., Fichera, Loris
This letter reports the design, construction, and experimental validation of a novel hand-held robot for in-office laser surgery of the vocal folds. In-office endoscopic laser surgery is an emerging trend in Laryngology: It promises to deliver the same patient outcomes of traditional surgical treatment (i.e., in the operating room), at a fraction of the cost. Unfortunately, office procedures can be challenging to perform; the optical fibers used for laser delivery can only emit light forward in a line-of-sight fashion, which severely limits anatomical access. The robot we present in this letter aims to overcome these challenges. The end effector of the robot is a steerable laser fiber, created through the combination of a thin optical fiber (0.225 mm) with a tendon-actuated Nickel-Titanium notched sheath that provides bending. This device can be seamlessly used with most commercially available endoscopes, as it is sufficiently small (1.1 mm) to pass through a working channel. To control the fiber, we propose a compact actuation unit that can be mounted on top of the endoscope handle, so that, during a procedure, the operating physician can operate both the endoscope and the steerable fiber with a single hand. We report simulation and phantom experiments demonstrating that the proposed device substantially enhances surgical access compared to current clinical fibers.
Deep Graph Random Process for Relational-Thinking-Based Speech Recognition
Huang, Hengguan, Xue, Fuzhao, Wang, Hao, Wang, Ye
Lying at the core of human intelligence, relational thinking is characterized by initially relying on innumerable unconscious percepts pertaining to relations between new sensory signals and prior knowledge, consequently becoming a recognizable concept or object through coupling and transformation of these percepts. Such mental processes are difficult to model in real-world problems such as in conversational automatic speech recognition (ASR), as the percepts (if they are modelled as graphs indicating relationships among utterances) are supposed to be innumerable and not directly observable. In this paper, we present a Bayesian nonparametric deep learning method called deep graph random process (DGP) that can generate an infinite number of probabilistic graphs representing percepts. We further provide a closed-form solution for coupling and transformation of these percept graphs for acoustic modeling. Our approach is able to successfully infer relations among utterances without using any relational data during training. Experimental evaluations on ASR tasks including CHiME-2 and CHiME-5 demonstrate the effectiveness and benefits of our method.
Going to Market - Radiology Today Magazine
App marketplaces are bridging the gap between AI creators and users. Words such as "democratize" and "ecosystem" are making their way from government and biology textbooks into the radiology lexicon, as a community of developers, engineers, and clinicians are interacting to establish AI marketplaces. These electronic storefronts are gathering places where vendors and consumers can share ideas and technology to create and use a variety of AI imaging tools. Within an AI marketplace, radiologists have equal opportunity to pick and choose from any of the available apps. They can also share feedback with developers as to how an app worked for them or even assist in the development of new apps customized for their clinical practices. And, they can be active participants in the development process without any knowledge of writing code. "With AI, there's a whole ecosystem of participants--industry leaders, health care startups, and research institutions--getting involved in creating apps and marketplaces where they can be made available to everyone," says Abdul Hamid Halabi, director of healthcare with NVIDIA.
Behavioral-clinical phenotyping with type 2 diabetes self-monitoring data
Levine, Matthew E., Albers, David J., Burgermaster, Marissa, Davidson, Patricia G., Smaldone, Arlene M., Mamykina, Lena
Words: 4252 Keywords: self-monitoring data, type 2 diabetes, machine learning, phenotyping, precision medicine ABSTRACT Objective: To evaluate unsupervised clustering methods for identifying individual-level behavioral-clinical phenotypes that relate personal biomarkers and behavioral traits in type 2 diabetes (T2DM) self-monitoring data. Materials and Methods: We used hierarchical clustering (HC) to identify groups of meals with similar nutrition and glycemic impact for 6 individuals with T2DM who collected self-monitoring data. We evaluated clusters on: 1) correspondence to gold standards generated by certified diabetes educators (CDEs) for 3 participants; 2) face validity, rated by CDEs, and 3) impact on CDEs' ability to identify patterns for another 3 participants. Results: Gold standard (GS) included 9 patterns across 3 participants. Of these, all 9 were rediscovered using HC: 4 GS patterns were consistent with patterns identified by HC (over 50% of meals in a cluster followed the pattern); another 5 were included as subgroups in broader clusers. After reviewing clusters, CDEs identified patterns that were more consistent with data (70% reduction in contradictions between patterns and participants' records). Discussion: Hierarchical clustering of blood glucose and macronutrient consumption appears suitable for discovering behavioral-clinical phenotypes in T2DM. Most clusters corresponded to gold standard and were rated positively by CDEs for face validity. Cluster visualizations helped CDEs identify more robust patterns in nutrition and glycemic impact, creating new possibilities for visual analytic solutions. Conclusion: Machine learning methods can use diabetes self-monitoring data to create personalized behavioral-clinical phenotypes, which may prove useful for delivering personalized medicine.